{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "e98b46dc-6c1a-4767-adeb-b585db6834e1",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Audio Similarity Search\n",
    "In this example we will be going over the code required to perform audio similarity searches. This example uses a the PANNs model to extract audio features that are then used with Milvus to build a system that can perform the searches.\n",
    "\n",
    "A deployable version of a reverse audio search can be found in this directory.\n",
    "\n",
    "## Data\n",
    "\n",
    "This example uses the TUT Acoustic scenes 2017 Evaluation dataset, which contains 1622 10-second audio clips that fall within 15 categories: Bus, Cafe,\n",
    "Car, City center, Forest path, Grocery store,  Home, Lakeside beach, Library, Metro station, Office, Residential area, Train, Tram, and Urban park.\n",
    "\n",
    "Dataset size: ~ 4.29 GB.\n",
    "\n",
    "\n",
    "Directory Structure:  \n",
    "The file loader used in this example requires that all the data be in .wav format due to librosa limitations. The way that files are read also limits the structure to a folder with all the data points. \n",
    "\n",
    "## Requirements\n",
    "\n",
    "|  Packages   |  Servers    |\n",
    "|-                  | -                 |   \n",
    "| pymilvus          | milvus-1.1.0      |\n",
    "| redis             | redis             |\n",
    "| librosa           |\n",
    "| ipython           |\n",
    "| numpy             |\n",
    "| panns_inference   |\n",
    "\n",
    "We have included a requirements.txt file in order to easily satisfy the required packages. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c51e6223-ddaf-41a8-b8fd-b889aa8a9eee",
   "metadata": {},
   "source": [
    "## Up and Running\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4f186717-c273-4c18-af7f-d30af2b87a2f",
   "metadata": {},
   "source": [
    "### Installing Packages\n",
    "Install the required python packages with `requirements.txt`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "c06c0299-c9fe-4563-91d1-e942716dbcb6",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Requirement already satisfied: pymilvus in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from -r requirements.txt (line 1)) (1.1.0)\n",
      "Requirement already satisfied: redis in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from -r requirements.txt (line 2)) (3.5.3)\n",
      "Requirement already satisfied: librosa in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from -r requirements.txt (line 3)) (0.8.0)\n",
      "Requirement already satisfied: ipython in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from -r requirements.txt (line 4)) (7.22.0)\n",
      "Requirement already satisfied: numpy in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from -r requirements.txt (line 5)) (1.20.2)\n",
      "Requirement already satisfied: panns_inference in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from -r requirements.txt (line 6)) (0.0.7)\n",
      "Requirement already satisfied: appnope in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from ipython->-r requirements.txt (line 4)) (0.1.2)\n",
      "Requirement already satisfied: pexpect>4.3 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from ipython->-r requirements.txt (line 4)) (4.8.0)\n",
      "Requirement already satisfied: backcall in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from ipython->-r requirements.txt (line 4)) (0.2.0)\n",
      "Requirement already satisfied: jedi>=0.16 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from ipython->-r requirements.txt (line 4)) (0.18.0)\n",
      "Requirement already satisfied: traitlets>=4.2 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from ipython->-r requirements.txt (line 4)) (5.0.5)\n",
      "Requirement already satisfied: pygments in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from ipython->-r requirements.txt (line 4)) (2.8.1)\n",
      "Requirement already satisfied: decorator in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from ipython->-r requirements.txt (line 4)) (5.0.7)\n",
      "Requirement already satisfied: pickleshare in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from ipython->-r requirements.txt (line 4)) (0.7.5)\n",
      "Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from ipython->-r requirements.txt (line 4)) (3.0.18)\n",
      "Requirement already satisfied: setuptools>=18.5 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from ipython->-r requirements.txt (line 4)) (52.0.0.post20210125)\n",
      "Requirement already satisfied: parso<0.9.0,>=0.8.0 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from jedi>=0.16->ipython->-r requirements.txt (line 4)) (0.8.2)\n",
      "Requirement already satisfied: ptyprocess>=0.5 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from pexpect>4.3->ipython->-r requirements.txt (line 4)) (0.7.0)\n",
      "Requirement already satisfied: wcwidth in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->ipython->-r requirements.txt (line 4)) (0.2.5)\n",
      "Requirement already satisfied: ipython-genutils in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from traitlets>=4.2->ipython->-r requirements.txt (line 4)) (0.2.0)\n",
      "Requirement already satisfied: scipy>=1.0.0 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from librosa->-r requirements.txt (line 3)) (1.6.3)\n",
      "Requirement already satisfied: audioread>=2.0.0 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from librosa->-r requirements.txt (line 3)) (2.1.9)\n",
      "Requirement already satisfied: soundfile>=0.9.0 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from librosa->-r requirements.txt (line 3)) (0.10.3.post1)\n",
      "Requirement already satisfied: scikit-learn!=0.19.0,>=0.14.0 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from librosa->-r requirements.txt (line 3)) (0.24.2)\n",
      "Requirement already satisfied: numba>=0.43.0 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from librosa->-r requirements.txt (line 3)) (0.53.1)\n",
      "Requirement already satisfied: pooch>=1.0 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from librosa->-r requirements.txt (line 3)) (1.3.0)\n",
      "Requirement already satisfied: joblib>=0.14 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from librosa->-r requirements.txt (line 3)) (1.0.1)\n",
      "Requirement already satisfied: resampy>=0.2.2 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from librosa->-r requirements.txt (line 3)) (0.2.2)\n",
      "Requirement already satisfied: llvmlite<0.37,>=0.36.0rc1 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from numba>=0.43.0->librosa->-r requirements.txt (line 3)) (0.36.0)\n",
      "Requirement already satisfied: appdirs in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from pooch>=1.0->librosa->-r requirements.txt (line 3)) (1.4.4)\n",
      "Requirement already satisfied: packaging in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from pooch>=1.0->librosa->-r requirements.txt (line 3)) (20.9)\n",
      "Requirement already satisfied: requests in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from pooch>=1.0->librosa->-r requirements.txt (line 3)) (2.25.1)\n",
      "Requirement already satisfied: six>=1.3 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from resampy>=0.2.2->librosa->-r requirements.txt (line 3)) (1.15.0)\n",
      "Requirement already satisfied: threadpoolctl>=2.0.0 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from scikit-learn!=0.19.0,>=0.14.0->librosa->-r requirements.txt (line 3)) (2.1.0)\n",
      "Requirement already satisfied: cffi>=1.0 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from soundfile>=0.9.0->librosa->-r requirements.txt (line 3)) (1.14.5)\n",
      "Requirement already satisfied: pycparser in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from cffi>=1.0->soundfile>=0.9.0->librosa->-r requirements.txt (line 3)) (2.20)\n",
      "Requirement already satisfied: matplotlib in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from panns_inference->-r requirements.txt (line 6)) (3.4.1)\n",
      "Requirement already satisfied: torchlibrosa in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from panns_inference->-r requirements.txt (line 6)) (0.0.9)\n",
      "Requirement already satisfied: ujson>=2.0.0 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from pymilvus->-r requirements.txt (line 1)) (4.0.2)\n",
      "Requirement already satisfied: grpcio-tools>=1.22.0 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from pymilvus->-r requirements.txt (line 1)) (1.37.0)\n",
      "Requirement already satisfied: grpcio>=1.22.0 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from pymilvus->-r requirements.txt (line 1)) (1.37.0)\n",
      "Requirement already satisfied: protobuf<4.0dev,>=3.5.0.post1 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from grpcio-tools>=1.22.0->pymilvus->-r requirements.txt (line 1)) (3.15.8)\n",
      "Requirement already satisfied: urllib3<1.27,>=1.21.1 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from requests->pooch>=1.0->librosa->-r requirements.txt (line 3)) (1.26.4)\n",
      "Requirement already satisfied: chardet<5,>=3.0.2 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from requests->pooch>=1.0->librosa->-r requirements.txt (line 3)) (4.0.0)\n",
      "Requirement already satisfied: certifi>=2017.4.17 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from requests->pooch>=1.0->librosa->-r requirements.txt (line 3)) (2020.12.5)\n",
      "Requirement already satisfied: idna<3,>=2.5 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from requests->pooch>=1.0->librosa->-r requirements.txt (line 3)) (2.10)\n",
      "Requirement already satisfied: pyparsing>=2.2.1 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from matplotlib->panns_inference->-r requirements.txt (line 6)) (2.4.7)\n",
      "Requirement already satisfied: python-dateutil>=2.7 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from matplotlib->panns_inference->-r requirements.txt (line 6)) (2.8.1)\n",
      "Requirement already satisfied: pillow>=6.2.0 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from matplotlib->panns_inference->-r requirements.txt (line 6)) (8.2.0)\n",
      "Requirement already satisfied: kiwisolver>=1.0.1 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from matplotlib->panns_inference->-r requirements.txt (line 6)) (1.3.1)\n",
      "Requirement already satisfied: cycler>=0.10 in /Users/filiphaltmayer/opt/miniconda3/envs/coinbase_demo/lib/python3.8/site-packages (from matplotlib->panns_inference->-r requirements.txt (line 6)) (0.10.0)\n"
     ]
    }
   ],
   "source": [
    "! pip install -r requirements.txt"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "64aa57b8-4f03-44d6-b58a-cca8bd6cb34f",
   "metadata": {},
   "source": [
    "### Starting Milvus Server\n",
    "\n",
    "This demo uses Milvus 1.1.0, please refer to the [Install Milvus](https://milvus.io/docs/v1.1.0/install_milvus.md) guide to learn how to use this docker container. For this example we wont be mapping any local volumes. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "7ef0f215-3027-49c2-bd22-8d0c7593ffb1",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "docker: Error response from daemon: Conflict. The container name \"/milvus_cpu_1.1.0\" is already in use by container \"4fb8cec9122862bcb864b6b782796d32dacfd0b8489bd9666224222cde746485\". You have to remove (or rename) that container to be able to reuse that name.\n",
      "See 'docker run --help'.\n"
     ]
    }
   ],
   "source": [
    "! docker run --name milvus_cpu_1.1.0 -d \\\n",
    "-p 19530:19530 \\\n",
    "-p 19121:19121 \\\n",
    "milvusdb/milvus:1.1.0-cpu-d050721-5e559c"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "39aae31a-e680-4d14-8510-4a1630c2950f",
   "metadata": {},
   "source": [
    "### Starting Redis Server\n",
    "\n",
    "We are using Redis as a metadata storage service for this example. Code can easily be modified to use a python dictionary, but that usually does not work in any use case outside of quick examples. We need a metadata storage service in order to be able to be able to map between embeddings and their corresponding audio clips."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "de55aa17-32ec-4031-ab99-136ff38c4789",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "docker: Error response from daemon: Conflict. The container name \"/redis\" is already in use by container \"0e45df4657c651586ae5c80d0db1605206415a0bcb101573b690a434b5a4f7e8\". You have to remove (or rename) that container to be able to reuse that name.\n",
      "See 'docker run --help'.\n"
     ]
    }
   ],
   "source": [
    "! docker run --name redis -d -p 6379:6379 redis"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "36365303-d035-48bd-9c7c-de42db4afba5",
   "metadata": {},
   "source": [
    "### Confirm Running Servers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "4235a4fd-ea7c-4e02-b367-9ca3f31fca90",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "    __  _________ _   ____  ______    \n",
      "   /  |/  /  _/ /| | / / / / / __/    \n",
      "  / /|_/ // // /_| |/ / /_/ /\\ \\    \n",
      " /_/  /_/___/____/___/\\____/___/     \n",
      "\n",
      "Welcome to use Milvus!\n",
      "Milvus Release version: v1.1.0, built at 2021-05-06 14:50.43, with OpenBLAS library.\n",
      "You are using Milvus CPU edition\n",
      "Last commit id: 5e559cd7918297bcdb55985b80567cb6278074dd\n",
      "\n",
      "Loading configuration from: /var/lib/milvus/conf/server_config.yaml\n",
      "WARNNING: You are using SQLite as the meta data management, which can't be used in production. Please change it to MySQL!\n",
      "Supported CPU instruction sets: avx2, sse4_2\n",
      "FAISS hook AVX2\n",
      "Milvus server started successfully!\n"
     ]
    }
   ],
   "source": [
    "! docker logs milvus_cpu_1.1.0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "c7eb3eca-0dfd-4b75-a8db-f0cd67410d66",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1:C 18 May 2021 20:22:25.046 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo\n",
      "1:C 18 May 2021 20:22:25.046 # Redis version=6.2.1, bits=64, commit=00000000, modified=0, pid=1, just started\n",
      "1:C 18 May 2021 20:22:25.046 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf\n",
      "1:M 18 May 2021 20:22:25.047 * monotonic clock: POSIX clock_gettime\n",
      "1:M 18 May 2021 20:22:25.047 * Running mode=standalone, port=6379.\n",
      "1:M 18 May 2021 20:22:25.048 # Server initialized\n",
      "1:M 18 May 2021 20:22:25.048 * Ready to accept connections\n",
      "1:M 18 May 2021 21:19:11.535 * 100 changes in 300 seconds. Saving...\n",
      "1:M 18 May 2021 21:19:11.537 * Background saving started by pid 20\n",
      "20:C 18 May 2021 21:19:11.541 * DB saved on disk\n",
      "20:C 18 May 2021 21:19:11.542 * RDB: 0 MB of memory used by copy-on-write\n",
      "1:M 18 May 2021 21:19:11.638 * Background saving terminated with success\n"
     ]
    }
   ],
   "source": [
    "! docker logs redis"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2cdfde3c-3475-4791-9ad9-08848fdc8086",
   "metadata": {},
   "source": [
    "### Downloading Data\n",
    "These commands download and unzip the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f2530850-db5f-4bb0-98df-8ef177576d4d",
   "metadata": {},
   "outputs": [],
   "source": [
    "!wget -O 'file1.zip' 'https://zenodo.org/record/1040168/files/TUT-acoustic-scenes-2017-evaluation.audio.1.zip?download=1' -q --show-progress\n",
    "!wget -O 'file2.zip' 'https://zenodo.org/record/1040168/files/TUT-acoustic-scenes-2017-evaluation.audio.2.zip?download=1' -q --show-progress\n",
    "!wget -O 'file3.zip' 'https://zenodo.org/record/1040168/files/TUT-acoustic-scenes-2017-evaluation.audio.3.zip?download=1' -q --show-progress\n",
    "!wget -O 'file4.zip' 'https://zenodo.org/record/1040168/files/TUT-acoustic-scenes-2017-evaluation.audio.4.zip?download=1' -q --show-progress\n",
    "\n",
    "!tar -xf file1.zip \n",
    "!tar -xf file2.zip \n",
    "!tar -xf file3.zip \n",
    "!tar -xf file4.zip \n",
    "!rm 'file1.zip' 'file2.zip' 'file3.zip' 'file4.zip'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4edb297f-3e0a-4891-aeda-61d6d473bbad",
   "metadata": {},
   "source": [
    "## Code Overview"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "755dac2b-c3c6-4735-80e5-606824a9a806",
   "metadata": {},
   "source": [
    "### Connecting to Servers\n",
    "We first start off by connecting to the servers. In this case the docker containers are running on localhost and the ports are the default ports. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "3f9663e1-cb68-4a49-a932-59c6a677e4ad",
   "metadata": {},
   "outputs": [],
   "source": [
    "#Connectings to Milvus and Redis\n",
    "import redis\n",
    "import milvus\n",
    "\n",
    "milv = milvus.Milvus(host = '127.0.0.1', port = 19530)\n",
    "red = redis.Redis(host = '127.0.0.1', port=6379, db=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fb367f99-232a-4f5e-a673-543c33ec5590",
   "metadata": {},
   "source": [
    "### Building Collection and Setting Index\n",
    "\n",
    "The next step involves creating a collection. A collection in Milvus is similar to a table in a relational database, and is used for storing all the vectors. To create a collection, we first must select a name, the dimension of the vectors being stored within, the index_file_size, and metric_type. The index_file_size corresponds to how large each data segmet will be within the collection. More information on this can be found here. The metric_type is the distance formula being used to calculate similarity. In this example we are using the Euclidean distance. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "dee0ac6a-023f-44ba-a9ec-2417e6b5e660",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Status(code=0, message='Create collection successfully!')\n"
     ]
    }
   ],
   "source": [
    "#Creating collection\n",
    "\n",
    "import time\n",
    "\n",
    "collection_name = \"audio_collection\"\n",
    "milv.drop_collection(collection_name) \n",
    "red.flushdb()\n",
    "time.sleep(.1)\n",
    "\n",
    "collection_param = {\n",
    "            'collection_name': collection_name,\n",
    "            'dimension': 2048,\n",
    "            'index_file_size': 1024,  # optional\n",
    "            'metric_type': milvus.MetricType.L2  # optional\n",
    "            }\n",
    "\n",
    "status, ok = milv.has_collection(collection_name)\n",
    "\n",
    "if not ok:\n",
    "    status = milv.create_collection(collection_param)\n",
    "    print(status)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "be80b84f-9b73-4d10-b2f5-93aa0f709410",
   "metadata": {},
   "source": [
    "After creating the collection we want to assign it an index type. This can be done before or after inserting the data. When done before, indexes will be made as data comes in and fills the data segments. In this example we are using IVF_SQ8 which requires the 'nlist' parameter. Each index types carries its own parameters. More info about this param can be found [here](https://milvus.io/docs/v1.0.0/index.md#CPU)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "9f0ad6ed-7555-4202-8219-bf7232af9c51",
   "metadata": {},
   "outputs": [],
   "source": [
    "#Indexing collection\n",
    "\n",
    "index_param = {\n",
    "    'nlist': 512\n",
    "}\n",
    "\n",
    "status = milv.create_index(collection_name, milvus.IndexType.IVF_SQ8, index_param)\n",
    "status, index = milv.get_index_info(collection_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a6c9b019-913d-4756-9c45-dbfd6b16d76c",
   "metadata": {},
   "source": [
    "### Processing and Storing Audio Files\n",
    "In order to store the audio tracks in Milvus, we must first get the embeddings. To do this, we start by loading the audio file using Librosa. Once we have the audio clip loaded we can pass it to the PANN model. In this case we are using the panns_inference library to simplfy the importing and processing. Once we recieve the embedding we can push it into Milvus and store each uniqueID and filepath combo into redis. We do this so that we can later access the audio file when displaying the results. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "40cea7cf-7c65-41ac-b61f-81e27f0ea708",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Checkpoint path: /Users/filiphaltmayer/panns_data/Cnn14_mAP=0.431.pth\n",
      "Using CPU.\n",
      "Starting Insert\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "import librosa\n",
    "import numpy as np\n",
    "from panns_inference import SoundEventDetection, labels, AudioTagging\n",
    "\n",
    "data_dir = './TUT-acoustic-scenes-2017-evaluation/audio'\n",
    "at = AudioTagging(checkpoint_path=None, device='cpu')\n",
    "\n",
    "def embed_and_save(path, at):\n",
    "    audio, _ = librosa.core.load(path, sr=32000, mono=True)\n",
    "    audio = audio[None, :]\n",
    "    try:\n",
    "        _, embedding = at.inference(audio)\n",
    "        embedding = embedding/np.linalg.norm(embedding)\n",
    "        status, ids = milv.insert(collection_name=collection_name, records=embedding)\n",
    "        if not status.OK():\n",
    "            print(\"Insert failed: {}\".format(status))\n",
    "        else:\n",
    "            red.set(str(ids[0]), path)\n",
    "    except:\n",
    "        print(\"failed: \" + path)\n",
    "\n",
    "\n",
    "print(\"Starting Insert\")\n",
    "for subdir, dirs, files in os.walk(data_dir):\n",
    "    for file in files:\n",
    "        path = os.path.join(subdir, file)\n",
    "        embed_and_save(path, at)\n",
    "print(\"Insert Done\")\n",
    "        \n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f6c737f3-7849-41e5-a3ec-6a0a7ddb6242",
   "metadata": {},
   "source": [
    "### Searching\n",
    "In this example we perform a search on a few randomly selected audio clips. In order to perform the search we must first apply the same processing that was done on the original audio clips. This will result in us having a set of embeddings."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "11cf4563-2d9f-43c6-aba7-598a575dbabe",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "def get_embed(paths, at):\n",
    "    embedding_list = []\n",
    "    for x in paths:\n",
    "        audio, _ = librosa.core.load(x, sr=32000, mono=True)\n",
    "        audio = audio[None, :]\n",
    "        try:\n",
    "            _, embedding = at.inference(audio)\n",
    "            embedding = embedding/np.linalg.norm(embedding)\n",
    "            embedding_list.append(embedding)\n",
    "        except:\n",
    "            print(\"Embedding Failed: \" + x)\n",
    "    return np.array(embedding_list, dtype=np.float32).squeeze()\n",
    "#     return embedding_list\n",
    "\n",
    "random_ids = [int(red.randomkey()) for x in range(3)]\n",
    "search_clips = [x.decode(\"utf-8\") for x in red.mget(random_ids)]\n",
    "embeddings = get_embed(search_clips, at)\n",
    "print(embeddings.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3c62208c-dfad-4686-9434-9dce8410d653",
   "metadata": {},
   "source": [
    "We can then take these embeddings and perform a search. The search requires a few arguments: the name of the collection, the vectors being searched for, how many closest vectors to be returned, and the parameters for the index, in this case nprobe. Once performed this example will return the searched clip and the result clips. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e1bca19c-7ca2-43d2-ae42-03fd5b5efc06",
   "metadata": {},
   "outputs": [],
   "source": [
    "import IPython.display as ipd\n",
    "\n",
    "def show_results(query, results, distances):\n",
    "    print(\"Query: \")\n",
    "    ipd.display(ipd.Audio(query))\n",
    "    print(\"Results: \")\n",
    "    for x in range(len(results)):\n",
    "        print(\"Distance: \" + str(distances[x]))\n",
    "        ipd.display(ipd.Audio(results[x]))\n",
    "    print(\"-\"*50)\n",
    "\n",
    "print(embeddings.shape)\n",
    "search_sub_param = {\n",
    "        \"nprobe\": 16\n",
    "    }\n",
    "\n",
    "search_param = {\n",
    "    'collection_name': collection_name,\n",
    "    'query_records': embeddings,\n",
    "    'top_k': 3,\n",
    "    'params': search_sub_param,\n",
    "    }\n",
    "\n",
    "start = time.time()\n",
    "status, results = milv.search(**search_param)\n",
    "end = time.time() - start\n",
    "\n",
    "print(\"Search took a total of: \", end)\n",
    "\n",
    "if status.OK():\n",
    "    for x in range(len(results)):\n",
    "        query_file = search_clips[x]\n",
    "        result_files = [red.get(y.id).decode('utf-8') for y in results[x]]\n",
    "        distances = [y.distance for y in results[x]]\n",
    "        show_results(query_file, result_files, distances)\n",
    "else:\n",
    "    print(\"Search Failed.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "16f7207e-1ca8-4aea-a4f2-eb0bb4111f82",
   "metadata": {},
   "source": [
    "## Conclusion\n",
    "This notebook shows how to search for similar audio clips. \n",
    "\n",
    "Check out our [demo system](https://zilliz.com/milvus-demos) to try out different solutions. "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
